ABSTRACT

We study the mass spectrum of sub-structures in the Perseus Molecular Cloud Complex traced by $^{13}$CO (1-0),
finding that $dN/dM \propto M^{-2.4}$ for the standard Clumpfind parameters. This result does not agree with the
classical $dN/dM \propto M^{-1.6}$. To understand this discrepancy we study the robustness of the mass spectrum derived
using the Clumpfind algorithm. Both 2D and 3D Clumpfind versions are tested, using 850 μm dust emission
and $^{13}$CO spectral-line observations of Perseus, respectively. The effect of varying the threshold is not
important, but varying the stepsize produces a different effect in the 2D and 3D cases. In the 2D case, where
emission is relatively isolated (associated with only the densest peaks in the cloud), the mass spectrum
variability is negligible compared to the mass function fit uncertainties. In the 3D case, however, where the
$^{13}$CO emission traces the bulk of the molecular cloud, the number of clumps and the derived mass spectrum are
highly correlated with the stepsize used. The distinction between "2D" and "3D" here is, more importantly, also a
distinction between "sparse" and "crowded" emission. In any "crowded" case, Clumpfind should not be used blindly
to derive mass functions. Clumpfind's output in the "crowded" case can still offer a statistical description of
emission useful in inter-comparisons, but the clump list should not be treated as a robust region decomposition
suitable to generate a physically meaningful mass function. We conclude that the $^{13}$CO mass spectrum depends
on the resolution of the observations, due to the hierarchical structure of molecular clouds.

Subject headings: ISM: clouds — stars: formation — ISM: molecules — ISM: individual (Perseus molecular complex)

1. INTRODUCTION

Molecular clouds (MCs) have usually been studied using $^{12}$CO and $^{13}$CO (1-0) transition line maps,
because they trace low-density material. When these observations are used the emission comes from the whole MC,
and the emission is "crowded". They also provide information about the velocity structure of the cloud, and we
refer to them as 3D data. Thanks to the new generation of bolometers, large-scale dust emission maps of entire MCs
are now possible (Motte et al. 1998; Hatchell et al. 2005; Johnstone et al. 2004; Kirk et al. 2006;
Enoch et al. 2006). However, because of the observing technique much of the large-scale emission is removed,
yielding a map with "sparse" emission. These dust emission maps are mostly used to find the densest objects in a
MC: dense cores. However, these data do not provide the velocity information needed to assess whether a core is
bound, and we refer to these data as 2D. Extinction maps provide another tool to study MCs and dense cores using
2D data (Cambrésy 1999; Lombardi et al. 2006). These maps give an estimate of the total column density in a
region, hence capturing the large-scale structures in MCs, and therefore the map is crowded. However, with some
post-processing techniques the extended emission can be removed, yielding a map with just sparse emission
(Alves et al. 2007). Finally, by observing molecular lines with a higher critical density or with an
interferometer, most of the large-scale structure is not traced, yielding a 3D data set but with sparse emission.

The structure of MCs has been studied using a variety of decomposition algorithms on $^{12}$CO and/or $^{13}$CO
(1-0) emission maps (e.g. Stutzki & Güsten 1990; Kramer et al. 1998; Williams et al. 1994, 1995). Algorithms that
decompose the MC typically take all the emission (above some threshold) and split it into clumps, which can later
be easily used to calculate a mass function. In such studies, it has been shown that the mass function of clumps
follows a power-law with $dN/dM \propto M^{-1.6 \pm 0.2}$ (Blitz 1993).

One of the most widely used cloud decomposition algorithms is Clumpfind (Williams et al. 1994), since it is
readily available and has only two user-controlled parameters. It was designed to study a whole MC using 3D
molecular line data in a systematic fashion (Williams et al. 1994, 1995), and the data historically had coarse
angular resolution, allowing the study of only the largest structures in the cloud. Clumpfind has also been
modified to handle 2D data with sparse emission, and successfully applied to study the core mass function (CMF,
e.g. Johnstone et al. 2004; Kirk et al. 2006; Alves et al. 2007; Reid & Wilson 2005, 2006a,b), and to compare
that CMF to
the Initial Mass Function (IMF) of stars, which appears as an almost invariant power-law (Salpeter 1955;
Muench et al. 2000; Kroupa 2001): $dN/dM \propto M^{-2.35}$ for $M > 0.6\,M_\odot$. In just a few cases molecular
line data from a higher density tracer have been used to study dense cores (Ikeda et al. 2007;
Walsh et al. 2007), adding velocity information to the mostly sparse emission.

In this work, we study the robustness of the mass spectrum derived using Clumpfind in crowded ($^{13}$CO (1-0))
and sparse (SCUBA 850 μm) emission. By using the $^{13}$CO (1-0) and SCUBA 850 μm data collected by the COMPLETE
team in the Perseus Molecular Cloud Complex (Ridge et al. 2006; Kirk et al. 2006)$^3$, we are able to study the
algorithm in both its 3D and 2D versions on real data sets (with overlapping sky coverage) with high resolution
and sensitivity.

3 All of the data from the COMPLETE (Coordinated Molecular Probe Line Extinction Thermal Emission) Survey are
available on-line at http://www.cfa.harvard.edu/COMPLETE
2. DATA

We use the $^{13}$CO (1-0) molecular line map obtained by the COMPLETE Survey (Ridge et al. 2006) using the
SEQUOIA 32-element focal plane array at the FCRAO telescope. Observations were carried out using the on-the-fly
technique. The data cube covers an area of ~6.25° × 3° with a 46" beam on a 23" grid, and it is presented in
$T_A^*$ scale.

The map is beam sampled and Hanning smoothed in velocity. The final pixel size is 46" and the velocity
resolution is $\Delta v = 0.066$ km s$^{-1}$. The median Root Mean Square (RMS) noise in the map is 0.1 K in
$T_A^*$, and all positions with RMS > 0.3 K are removed from the map. In addition, a noise-added $^{13}$CO cube
(with RMS = 0.2 K) is also used.

We also use the 850 μm map obtained with SCUBA on the JCMT (Kirk et al. 2006). The pixel scale is 6", while the
effective beam is 19.9". The mean RMS in the map is ~0.06 Jy beam$^{-1}$. The map coverage is smaller than that
of the $^{13}$CO data, but it covers the densest regions in the cloud, where the dense cores identified by SCUBA
lie (Kirk et al. 2006; Hatchell et al. 2005). It is important to note that in the SCUBA map any structure larger
than ~2' is removed during the data reduction process (in addition to the observational problems of detecting
extended structure with bolometers), making the map mostly devoid of extended emission. Hence the SCUBA map is
substantially different from the $^{13}$CO data, because the latter trace the more extended material. Small areas
near the edge of the SCUBA map are removed to avoid some image artifacts.

3. CLOUD STRUCTURE 
3.1. Clump Identification 

Clumpfind needs only two parameters (threshold and stepsize) to decompose the emission into a set of clumps. The
threshold parameter sets the minimum emission required to be included in the decomposition, while the stepsize
defines how finely separated the iso-surfaces (or iso-contours) are drawn by the algorithm in order to check for
structures. In other words, the threshold sets the number of pixels included in the decomposition, and the
stepsize sets the contrast needed between two features for them to be identified as different objects.
Williams et al. (1994) suggest using a threshold and stepsize value of 2$\sigma$, where $\sigma$ is the noise in
the data. Clumpfind assigns all the emission above the given threshold to clumps, and it can be applied to both
2D and 3D data sets.
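
The roles of the two parameters can be illustrated with a deliberately simplified 1D toy, sketched below. It is not the Clumpfind code itself: the function names (`contour_levels`, `count_clumps_1d`), the nearest-neighbour assignment of merged pixels, and the example profile are all assumptions made for illustration. The toy contours a profile at threshold, threshold + stepsize, and so on, works down from the highest contour, and starts a new clump whenever an isolated region appears.

```python
def contour_levels(peak, threshold, stepsize):
    """Contour levels threshold, threshold + stepsize, ... not exceeding the peak."""
    levels = []
    k = 0
    while threshold + k * stepsize <= peak:
        levels.append(threshold + k * stepsize)
        k += 1
    return levels


def count_clumps_1d(profile, threshold, stepsize):
    """Count clumps in a 1D profile, working down from the highest contour."""
    n = len(profile)
    labels = [0] * n  # 0 means "not yet assigned to any clump"
    n_clumps = 0
    for level in reversed(contour_levels(max(profile), threshold, stepsize)):
        i = 0
        while i < n:
            if profile[i] < level:
                i += 1
                continue
            # Find the connected run of pixels at or above this contour level.
            j = i
            while j < n and profile[j] >= level:
                j += 1
            run = list(range(i, j))
            labelled = [k for k in run if labels[k]]
            if not labelled:
                # Isolated region with no previously found clump: new clump.
                n_clumps += 1
                for k in run:
                    labels[k] = n_clumps
            else:
                # Region touches existing clumps: give each new pixel the
                # label of the nearest already-labelled pixel (toy rule).
                for k in run:
                    if not labels[k]:
                        nearest = min(labelled, key=lambda m: abs(m - k))
                        labels[k] = labels[nearest]
            i = j
    return n_clumps


# Two peaks (5.0 and 4.5) separated by a dip at 3.0, in arbitrary units.
profile = [0.0, 1.0, 5.0, 3.0, 4.5, 1.0, 0.0]
print(count_clumps_1d(profile, threshold=1.0, stepsize=1.0))  # 2
print(count_clumps_1d(profile, threshold=1.0, stepsize=4.0))  # 1
```

With the fine stepsize the secondary peak is separated from the main one, while with the coarse stepsize no contour falls between the dip and the secondary peak, so the two are merged into a single clump.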

For this analysis, different values for both parameters are used to test the robustness of the results derived
using Clumpfind. In the 3D case ($^{13}$CO), the threshold is set to 3, 5, and 7$\sigma$ for the original data
and 5$\sigma$ for the noise-added data, while the stepsize is varied between 3 and 20$\sigma$ with a spacing of
0.5$\sigma$. In the 2D case (SCUBA), the threshold is set to 3, 5, and 7$\sigma$, and the stepsize is varied
between 2 and 16.5$\sigma$ with a spacing of 0.5$\sigma$.

Some of these stepsize values may seem unusually large, but given the improved quality of the available data,
larger stepsizes are required to identify the largest structures in MCs (see Rathborne et al. 2009).
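
As a rough guide to what these stepsizes mean in physical units, the sketch below enumerates the stepsize grids quoted above and converts them using the noise levels given in Section 2. The helper names (`sigma_grid`, `to_physical`) are illustrative assumptions, and the sketch assumes that every grid point was actually run.

```python
def sigma_grid(start, stop, spacing):
    """Values from start to stop (inclusive), in units of the noise sigma."""
    n_steps = int(round((stop - start) / spacing))
    return [start + i * spacing for i in range(n_steps + 1)]


def to_physical(n_sigma, rms):
    """Convert a level given in units of sigma to the units of the RMS noise."""
    return n_sigma * rms


# RMS noise quoted in the Data section: K, K and Jy/beam respectively.
RMS_13CO = 0.1
RMS_13CO_NOISE_ADDED = 0.2
RMS_SCUBA = 0.06

stepsizes_3d = sigma_grid(3.0, 20.0, 0.5)   # 13CO runs
stepsizes_2d = sigma_grid(2.0, 16.5, 0.5)   # SCUBA runs

print(len(stepsizes_3d), len(stepsizes_2d))
print(to_physical(stepsizes_3d[-1], RMS_13CO))      # largest 13CO stepsize in K
print(to_physical(10.0, RMS_13CO_NOISE_ADDED))      # same level in the noisier cube
```

Under these assumptions, a 20$\sigma$ stepsize in the original cube and a 10$\sigma$ stepsize in the noise-added cube both correspond to about 2 K.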

3.2. Mass Estimate 

We adopt the conversion between $^{13}$CO integrated intensity, $W(^{13}\mathrm{CO})$, and extinction, $A_V$,
derived by Pineda et al. (2008):

$$A_V = 0.350\, W(^{13}\mathrm{CO}) . \qquad (1)$$

This conversion is derived for Perseus using the COMPLETE extinction map and FCRAO data (assuming a main-beam
efficiency of 0.49). To convert from visual extinction to column density, we assume that the ratio between
$N(\mathrm{H})$ and $E(B-V)$ is $5.8 \times 10^{21}$ cm$^{-2}$ mag$^{-1}$ (Bohlin et al. 1978), and $R_V = 3.1$.

For the dust continuum emission, we assume it is optically thin,

$$M_{850} = 0.48\, S_{850} \left( \frac{\kappa_{850}}{0.02\ \mathrm{cm^2\,g^{-1}}} \right)^{-1} M_\odot , \qquad (2)$$

where $S_{850}$ is the flux at 850 μm, $\kappa_{850}$ is the opacity at 850 μm, and we assume a dust opacity of
0.02 cm$^2$ g$^{-1}$ (Ossenkopf & Henning 1994), a dust temperature of $T_d = 15$ K, and a distance to Perseus of
250 pc. These adopted values are the same as those used by Kirk et al. (2006).
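
The conversions of this section can be written as a short sketch. The function and constant names are hypothetical, and the units are assumptions not stated explicitly above: integrated intensity in K km s$^{-1}$, extinction in magnitudes, and 850 μm flux in Jy. The factor 0.48 is taken directly from equation (2), so it already includes the adopted temperature of 15 K and distance of 250 pc.

```python
AV_PER_W13CO = 0.350   # Eq. (1); assumed units: mag per (K km/s)
NH_PER_EBV = 5.8e21    # N(H)/E(B-V) in cm^-2 mag^-1 (Bohlin et al. 1978)
R_V = 3.1              # ratio of total to selective extinction, A_V / E(B-V)
KAPPA_REF = 0.02       # reference dust opacity at 850 micron, cm^2 g^-1


def extinction_from_w13co(w13co):
    """Visual extinction A_V from 13CO integrated intensity, Eq. (1)."""
    return AV_PER_W13CO * w13co


def column_density_from_av(a_v):
    """Hydrogen column density N(H) in cm^-2, using E(B-V) = A_V / R_V."""
    return NH_PER_EBV * a_v / R_V


def dust_mass_850(s850, kappa_850=KAPPA_REF):
    """Mass in solar masses from 850 micron flux, Eq. (2), for T_d = 15 K and 250 pc."""
    return 0.48 * s850 * (kappa_850 / KAPPA_REF) ** -1


a_v = extinction_from_w13co(10.0)
print(a_v, column_density_from_av(a_v), dust_mass_850(1.0))
```

Halving the assumed opacity doubles the derived dust mass, which is the only dependence left explicit in equation (2).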

3.3. Completeness limit 

For the $^{13}$CO data, the completeness limit is estimated by comparing the derived mass and radius of each
clump with a sensitivity curve. The sensitivity curve is the mass of the smallest realistic clump that Clumpfind
can find at a given radius (red dashed line in Figure 1). The completeness limit is then estimated as the largest
mass below this sensitivity curve (for each Clumpfind run), i.e. the mass where the data and the red line merge
in Figure 1, shown with a red filled circle. Here, a minimum size of 3 velocity channels ($\Delta v$) is assumed,
and the brightness in each pixel must be larger than the threshold:

M™.„(/.) = 0.09i(^)(^)(|)m,, (3) 

where ^ is the threshold. In Figure 1 two Clumpfind runs are shown: original and noise-added $^{13}$CO. In both
cases the possible change in slope of the mass function happens close to the completeness limit shown by the
arrow.

For simplicity, we use a single completeness limit for each dataset, a value larger than the completeness limit
estimated for any of the individual Clumpfind runs: 4 and 3 $M_\odot$ for the original and noise-added $^{13}$CO,
respectively. The completeness limit for the objects identified in the SCUBA map is estimated by
Kirk et al. (2006) as 0.6 $M_\odot$, defined not as the mass where the mass function changes, but as the mass of
an object that would be missed given the typical size of the cores found.

4. RESULTS 

The simplest comparison between different Clumpfind runs is how many clumps are defined. In panels a and b of
Figure 2 we show that the total number of clumps identified in each run (open diamonds) decreases when increasing
the threshold or stepsize, in either 2D or 3D. This decrease is not a surprise: with a higher threshold there are
fewer pixels available, and therefore a smaller volume in which to define clumps; in the case of the stepsize, a
larger stepsize can miss some real structure, but a small stepsize can also identify spurious clumps from
structure due to noise (i.e. split a single clump into two or more because the noise creates fake structure above
the stepsize level). However, Clumpfind runs with thresholds of 3$\sigma$ can identify twice as many clumps as
runs with higher thresholds and the same stepsize (see panels a and b in Figure 2), while runs with 5 and
7$\sigma$ thresholds follow a similar curve (with the runs of lower threshold still finding more clumps, as
expected) for a given data set. It is important to note that objects identified in $^{13}$CO are not necessarily
bound.

Despite the difference in the total number of clumps, the number of clumps above the completeness limit (shown as
filled circles in panels a and b of Figure 2) is comparable between different Clumpfind runs, not only for
different thresholds but also for different stepsizes. However, the number of clumps above the completeness limit
is usually less than half the total number of clumps, and therefore most of the identified clumps are not even
considered in the mass function analysis.

Fig. 1. — Comparison of clump mass functions for original and noise-added $^{13}$CO data. Left panel shows the
change in mass function of two Clumpfind runs using the same threshold and stepsize (in $\sigma$ units), but with
different noise levels. Arrow shows the corresponding completeness limit. Middle and right panels show the mass
and radius for each identified clump. Red dashed line shows the sensitivity curve, and the red filled circle shows
the sensitivity limit estimated for each Clumpfind run. The possible mass function turnover is close to where the
identified clumps cross the sensitivity curve.

Fig. 2. — Summary of all Clumpfind runs as a function of stepsize. Colors represent different thresholds: blue,
red, and green for 3, 5, and 7$\sigma$, respectively; we also show in orange results with a threshold of
5$\sigma$ for $^{13}$CO data with added noise. Left and right columns show results for $^{13}$CO and SCUBA data,
respectively. Panels a and b show the number of clumps under a given category per model. The total number of
clumps found and the total number of clumps with mass larger than the completeness limit are shown as open
diamonds and filled circles, respectively. Panels c and d show the exponent of the fitted mass spectrum of clumps
above the completeness limit, $dN/dM \propto M^{-\alpha}$, with error bars estimated from equation (6).
Horizontal black lines show some fiducial exponents for comparison. The average noise in the $^{13}$CO,
$^{13}$CO with added noise, and SCUBA data is 0.1 K, 0.2 K, and 0.06 Jy beam$^{-1}$, respectively. The
completeness limit is estimated to be 4 $M_\odot$, 3 $M_\odot$, and 0.6 $M_\odot$ for the $^{13}$CO, $^{13}$CO
with added noise, and SCUBA data. Panel c also shows that for different noise levels in the data, if a stepsize of
~2 K (20 and 10$\sigma$ for original and noise-added data, respectively) is used, then the fitted power-law
exponents are closer to previous works.

The differential mass function, $dN_{cl}/dM \approx \Delta N_{cl}/\Delta M$, is usually approximated by a
power-law, $dN_{cl}/dM \propto M^{-\alpha}$. However, if the data are binned, then variations in the fitted
power-law exponent are generated by changing the bin width and shifting the bins (Rosolowsky 2005); and when
analyzing the cumulative function special care must be taken to avoid the undesired effects of truncation
(Muñoz et al. 2007; Li et al. 2007). To avoid both problems we fit the differential mass function without binning
the data, using the Maximum Likelihood Estimate (MLE, see Clauset et al. 2007).

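We fit the following function,

$$\frac{dN_{cl}}{dM} = N_{cl}\, \frac{\alpha - 1}{M_{min}} \left( \frac{M}{M_{min}} \right)^{-\alpha} , \qquad (4)$$

where $M_{min}$ is the minimum mass of the sample to be used in the fitting, $N_{cl}$ is the number of clumps more
massive than $M_{min}$, and $\alpha$ is the power-law exponent of the distribution.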
Using MLE, the exponent is estimated by

$$\alpha = 1 + N_{cl} \left[ \sum_{i=1}^{N_{cl}} \ln \frac{M_i}{M_{min}} \right]^{-1} , \qquad (5)$$

and the standard error on $\alpha$ is approximated by

$$\sigma_\alpha = \frac{\alpha - 1}{\sqrt{N_{cl}}} . \qquad (6)$$

This estimate can be regarded as a lower limit on the uncertainty, because it does not take into account
uncertainties in the mass measurements. For this work we set $M_{min}$ equal to the completeness limit.
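
The estimator of equations (5) and (6) can be sketched in a few lines of Python. The function names (`fit_power_law`, `draw_power_law`) and the synthetic sample are illustrative assumptions rather than the code used for this analysis. The synthetic masses are drawn by inverse-transform sampling from a pure power-law, only to check that the estimator recovers a known exponent.

```python
import math
import random


def fit_power_law(masses, m_min):
    """Unbinned MLE of the exponent alpha for dN/dM ~ M^-alpha above m_min.

    Returns (alpha, sigma_alpha, n_cl) following equations (5) and (6).
    """
    sample = [m for m in masses if m >= m_min]
    n_cl = len(sample)
    log_sum = sum(math.log(m / m_min) for m in sample)
    if n_cl == 0 or log_sum <= 0.0:
        raise ValueError("need at least one clump strictly above m_min")
    alpha = 1.0 + n_cl / log_sum
    sigma_alpha = (alpha - 1.0) / math.sqrt(n_cl)
    return alpha, sigma_alpha, n_cl


def draw_power_law(alpha, m_min, size, seed=0):
    """Draw masses from a pure power-law above m_min (inverse-transform sampling)."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which avoids raising zero to a negative power.
    return [m_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(size)]


if __name__ == "__main__":
    masses = draw_power_law(alpha=2.4, m_min=4.0, size=5000, seed=1)
    alpha, sigma_alpha, n_cl = fit_power_law(masses, m_min=4.0)
    print(f"alpha = {alpha:.2f} +/- {sigma_alpha:.2f} from {n_cl} clumps")
```

Because the error in equation (6) scales as $1/\sqrt{N_{cl}}$, a sample with only a few tens of clumps above the completeness limit has a substantial uncertainty in $\alpha$ even before uncertainties in the masses are considered.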

The exponent, $\alpha$, estimated for every Clumpfind run is shown as a function of stepsize in panels c and d of
Figure 2 for $^{13}$CO and SCUBA, respectively. The values for the clump mass spectrum derived by
Lada et al. (1991) and Stutzki & Güsten (1990) (using different methods and in different regions) and Salpeter's
exponent (for the IMF) are also shown for comparison. For a standard stepsize of 3$\sigma$, the $^{13}$CO clump
mass spectrum is similar to the IMF, and steeper than the values derived by previous works.

An interesting result is that the clump mass spectrum agrees (within the uncertainties of the fit) for different
threshold values if the same stepsize is used. More important, however, is the fact that the estimated power-law
exponent, $\alpha$, is correlated with the stepsize. This variation in $\alpha$ can be as high as 40%, and the
correlation appears in both versions of Clumpfind: 3D and 2D. For 2D Clumpfind, we find that this correlation is
negligible compared with the uncertainties associated with the fitted power-law exponent.

5. DISCUSSION 

From Figure 2 we can clearly see that the power-law exponent fitted to the Clumpfind decomposition of the
$^{13}$CO data is strongly correlated with the stepsize, and therefore not unique. In fact, our results show that
Clumpfind is not very useful for identifying small structures within a map, unless they are isolated. The reason
for this correlation between fitted power-law exponent and stepsize is that for a small stepsize less contrast is
required to identify a structure, generating more but smaller objects and therefore a steeper mass distribution;
this effect is more important in crowded regions (see Figure 3). Although this conclusion may seem obvious, the
amount of variation in the fitted exponent has typically been deemed negligible. An independent analysis of
numerical simulations carried out by Smith et al. (2008) also found that Clumpfind results change with the
resolution of the data. Smith et al. (2008) "observe" a numerical simulation using different spatial resolutions
for the final "data", and then run Clumpfind on them. They show that for the same region, Clumpfind identifies a
different number of clumps, with different derived properties, when different spatial resolutions are used.

In the 2D case, we notice that the exponent does not vary significantly with stepsize. In addition, the exponent
fluctuation is almost negligible when compared with the associated uncertainties. However, Figure 3 shows an
example of how different the Clumpfind decomposition is for three different sets of input parameters. By comparing
different panels in Figure 3 we see that some structures appear or are split under different parameters. These
subtle differences suggest that, to create a reliable catalogue, a manual check is needed to ensure that the
structures are meaningful. Moreover, Kainulainen et al. (2009) recently showed, using the Pipe MC extinction data,
that the CMF cannot be recovered in crowded cases.

Clumpfind is also run on the noise-added $^{13}$CO data with a 5$\sigma$ threshold. The fitted power-law
exponents for noise-added $^{13}$CO structures are similar to those derived for the original $^{13}$CO data only
for large stepsizes (~2 K). Also, only for large stepsizes is the fitted power-law exponent close to results from
previous studies of the structure in MCs. This should not be a surprise, since Clumpfind will assign the
$^{13}$CO extended emission to several clumps, and by adding noise the boundaries of these clumps are changed.
This generates more low-mass clumps and also changes the slope of the power-law. However, there must be a point
where Clumpfind identifies the largest structures in the cloud and the exponent should not change much for larger
stepsizes. We expect this effect to be less dramatic when the emission is sparse (e.g. SCUBA map, interferometer
data, or a higher density tracer), because there is less room to change the boundaries and masses of the objects.
In addition, a different structure identification technique that allows for hierarchical structure, the
dendrogram (Rosolowsky et al. 2008; Goodman et al. 2009), is already available and could be used to derive the
mass function of bound structures or of any specific structure under consideration.

6. SUMMARY 

The $^{13}$CO and 850 μm maps of Perseus from the COMPLETE Survey are used to study the 3D and 2D versions of the
Clumpfind algorithm, respectively.

The total number of identified structures is highly corre- 
lated with the parameters used (threshold and/or stepsize). 
Decompositions run with a smaller threshold and stepsize pro- 
duce more objects. 

We use a new method to estimate the completeness limit for a sample of clumps. The mass spectrum of the
identified structures, $dN/dM$, is fitted with a power-law above the completeness limit. For the standard
Clumpfind parameters, the mass function exponent is closer to Salpeter than to the classical result from
Blitz (1993). Despite the small variation in the number of objects above the completeness limit, the fitted
power-law exponent for $^{13}$CO structures is a strong function of the stepsize, while it is independent of the
threshold used. The power-law exponent of SCUBA objects is also correlated with stepsize, but this effect is
negligible compared to the associated uncertainties from the fitting. The $^{13}$CO power-law exponent variation
shows that the cloud structure changes as we approach smaller scales, and that Clumpfind is still a useful tool
to study the structure of a molecular cloud or the difference between two regions. However, this also means that
it is not possible to derive a single mass function describing the sub-structure in molecular clouds when using a
non-hierarchical decomposition. Most likely, a better way to study the structure in molecular clouds is to use an
identification scheme that takes into account the hierarchical nature of these regions (e.g. dendrograms). In
that case, mass distribution functions of the bound material could be used as an observable.
Fig. 3. — Comparison of different Clumpfind runs on the NGC1333 region. Panels b, c and d show clumps found in
three different Clumpfind runs, and crosses mark the position of cores found by Kirk et al. (2006). Panel a shows
the dust emission map in the NGC1333 region, with overlaid contours at the 1, 3, 5, and 10$\sigma$ levels; in
addition the beam size (19.9") is shown in the bottom left corner. Small changes in the parameters used generate
small (but important) changes in the catalogue obtained.